Taking the intersection of differentially abundant taxa and graphing them revealed that even when taking the intersection, lung tissue taxa prevalence was spurious. Since DAA was redone to reflect testing based on hypotheses, I am rerunning this code to see if there are differences. Lung tissue taxa prevalence is still spurious so I will be focusing on fecal taxa only.

1 Method

One method of running correlations is using the transformed relative abundance data then using Pearson correlation. I had previously used arcsine transformation and considered the centered-log ratio (CLR) transformation. However, plotting correlations with raw relative abundance may show non-obvious relationships. Per Dr. Shi’s recommendation, I will be using Spearman’s rank correlation (non-parametric) with the raw relative abundance values instead.

There are certain lung tissue taxa of interest for Dr. Ingram: Flavobacteriales, Capnocytophaga and Flavobacterium. In addition, differential immune cell counting data has been added on 2/10/2024.

2 Set-Up

2.1 Load libraries and directories

# rm(list = ls(all = TRUE)) ## Unload all packages if necessary
library(rstatix)
library(ggpubr)
library(pals)
library(RColorBrewer)
library(ggrepel)
library(DT)
library(plotly)
library(parallel)
library(dplyr)
library(tidyr)
library(readr)
library(tibble)
library(stringr)
library(phyloseq)
library(Hmisc)
library(cowplot)
library(ggtext)
rm(list=ls()) # clear environment 

git.dir = file.path("/hpc/group/dmc/Eva/GLP_KO_Microbiome_final") # CHANGE ME 
fig.dir = file.path(git.dir, "figures")

map.file = file.path(git.dir, "Metadata_GLP1.csv")
meta.df = read_csv(map.file, show_col_types = FALSE)

F_hfd_intersect.file = file.path(git.dir, "fecal_deseq_hfd_intersect.csv")
f.hfd.intersect = read_csv(F_hfd_intersect.file, show_col_types = FALSE)

F_nosurgery_intersect.file = file.path(git.dir, "fecal_deseq_nosurgery_intersect.csv")
f.nosurgery.intersect = read_csv(F_nosurgery_intersect.file, show_col_types = FALSE)

F_RA.file = file.path("fecal_relative_abundance.csv")
f.ra = read_csv(F_RA.file, show_col_types = FALSE)

diff.file = file.path(git.dir, "GLP1_study_balf_diff_counts.csv")
diff.df = read_csv(diff.file,col_names = TRUE, show_col_types = FALSE )

glucose.file = file.path(git.dir, "GLP1_study_glucose_calculations.csv")
glucose.df = read_csv(glucose.file, col_names = TRUE, show_col_types = FALSE )

protein.file = file.path(git.dir, "GLP1_study_proteins.csv")
protein.df = read_csv(protein.file,col_names = TRUE, show_col_types = FALSE )

trichrome.file = file.path(git.dir, "GLP1_study_trichrome.csv")
trichrome.df = read_csv(trichrome.file,col_names = TRUE, show_col_types = FALSE )

hist.file = file.path(git.dir, "GLP1_study_histology.csv")
hist.df = read_csv(hist.file, show_col_types = FALSE)

3 Correlations: Tests to run

Based on the graphs of differentially abundant taxa that were both identified by DESeq2 and ANCOM-II, many taxa were spurious. As such, I will only run taxon-health marker correlation with fecal taxa.

In addition, there are a lot of variables to choose from and I expect more information to be available in near future. For now, the variables that I am most interested are: 1. Weight 2. Glucose calculations: AUC, MaxPeak, baseline 3. Serum protein levels except for insulin 4. Immune cells (eosinophils and neutrophils) 5. Histology scores and trichrome data

For simplicity, only fecal microbiome relative abundance on the same day as the health data was collected will be used.

3.1 Clean up metadata for correlations

meta.df %>%
  filter(Sample_type == "Fecal" & Type == "True Sample") %>% 
  select(Label, Mouse, Genotype:Surgery, Week, contains("Weight"))  -> meta.slim

glucose.df %>% select(Mouse, Week, AUC:Glucose_min_0) -> glucose.slim

meta.slim %>% 
  left_join(glucose.slim, by = c("Mouse", "Week")) %>% filter(is.na(Mouse) == FALSE)  -> corr.df

meta.slim %>% filter(Week == 13) %>%
  right_join(diff.df, by = "Mouse") -> diff.meta.df
head(corr.df)
head(diff.meta.df)
meta.slim %>% dim()
## [1] 182  12
f.ra %>% distinct(Sample) %>% nrow()
## [1] 178

3.2 Subset fecal taxa to only those that were differentially abundant

f.hfd.intersect$Group <- "HFD"
f.nosurgery.intersect$Group <- "No Surgery"

rbind(f.hfd.intersect, f.nosurgery.intersect) -> f.intersect
f.intersect
f.ra %>% filter(OTU %in% f.intersect$row) %>% 
  select(Label, OTU, Abundance) %>%
  pivot_wider(names_from = OTU,
              values_from = Abundance) %>% 
  dplyr::mutate(Label = as.character(Label)) -> FullRA.df

FullRA.df %>% head()

3.3 Weight Correlations

corr.df %>% select(Label, Week, Weight_g_wk0:Weight_g_wk13) %>%
  mutate(Weight = case_when(Week == 0 ~ Weight_g_wk0,
                            Week == 10 ~ Weight_g_wk10,
                            Week == 13 ~ Weight_g_wk13)) %>%
  select(Label, Weight) %>% 
  filter(is.na(Weight) != TRUE) %>%
  left_join(FullRA.df, by = "Label") -> Weight.corr.df

head(Weight.corr.df)
dim(Weight.corr.df)
## [1] 178  42

This dataframe structure is such that the recorded weight matches the fecal sample label for the day it was collected.

Weight.corr.df %>% colnames()
##  [1] "Label"                                
##  [2] "Weight"                               
##  [3] "Bifidobacterium"                      
##  [4] "Romboutsia"                           
##  [5] "Lactococcus"                          
##  [6] "Rikenella"                            
##  [7] "Prevotellaceae UCG-001"               
##  [8] "Rikenellaceae RC9 gut group"          
##  [9] "Coriobacteriaceae UCG-002"            
## [10] "Prevotellaceae NK3B31 group"          
## [11] "Tyzzerella"                           
## [12] "Parasutterella"                       
## [13] "Lachnospiraceae UCG-001"              
## [14] "Muribaculum"                          
## [15] "Tuzzerella"                           
## [16] "Candidatus Arthromitus"               
## [17] "Acetatifactor"                        
## [18] "Marvinbryantia"                       
## [19] "Butyricicoccus"                       
## [20] "Candidatus Stoquefichus"              
## [21] "Lactobacillaceae"                     
## [22] "Bifidobacteriaceae"                   
## [23] "Peptostreptococcaceae"                
## [24] "Streptococcaceae"                     
## [25] "Prevotellaceae"                       
## [26] "Atopobiaceae"                         
## [27] "Sutterellaceae"                       
## [28] "Enterococcaceae"                      
## [29] "[Eubacterium] coprostanoligenes group"
## [30] "Erysipelatoclostridiaceae"            
## [31] "Bifidobacteriales"                    
## [32] "Peptostreptococcales-Tissierellales"  
## [33] "Oscillospirales"                      
## [34] "Burkholderiales"                      
## [35] "Clostridia UCG-014"                   
## [36] "Actinobacteria"                       
## [37] "Coriobacteriia"                       
## [38] "Gammaproteobacteria"                  
## [39] "Verrucomicrobiae"                     
## [40] "Actinobacteriota"                     
## [41] "Proteobacteria"                       
## [42] "Verrucomicrobiota"
Weight.corr.df %>%
  rstatix::cor_test(vars = Weight,
           vars2 = c("Bifidobacterium":"Verrucomicrobiota"),
           method = "spearman") -> Weight.corr.res 

Weight.corr.res$n <- nrow(Weight.corr.df)
Weight.corr.res$Week <- "0, 10, 13 (aggregated)"

3.4 Glucose tolerance metrics

corr.df %>% filter(is.na(AUC) != TRUE) %>% 
  select(Label, AUC:Glucose_min_0)  %>%
  left_join(FullRA.df, by = "Label") -> glucose.df

glucose.df %>% colnames()
##  [1] "Label"                                
##  [2] "AUC"                                  
##  [3] "MaxPeak"                              
##  [4] "Glucose_min_0"                        
##  [5] "Bifidobacterium"                      
##  [6] "Romboutsia"                           
##  [7] "Lactococcus"                          
##  [8] "Rikenella"                            
##  [9] "Prevotellaceae UCG-001"               
## [10] "Rikenellaceae RC9 gut group"          
## [11] "Coriobacteriaceae UCG-002"            
## [12] "Prevotellaceae NK3B31 group"          
## [13] "Tyzzerella"                           
## [14] "Parasutterella"                       
## [15] "Lachnospiraceae UCG-001"              
## [16] "Muribaculum"                          
## [17] "Tuzzerella"                           
## [18] "Candidatus Arthromitus"               
## [19] "Acetatifactor"                        
## [20] "Marvinbryantia"                       
## [21] "Butyricicoccus"                       
## [22] "Candidatus Stoquefichus"              
## [23] "Lactobacillaceae"                     
## [24] "Bifidobacteriaceae"                   
## [25] "Peptostreptococcaceae"                
## [26] "Streptococcaceae"                     
## [27] "Prevotellaceae"                       
## [28] "Atopobiaceae"                         
## [29] "Sutterellaceae"                       
## [30] "Enterococcaceae"                      
## [31] "[Eubacterium] coprostanoligenes group"
## [32] "Erysipelatoclostridiaceae"            
## [33] "Bifidobacteriales"                    
## [34] "Peptostreptococcales-Tissierellales"  
## [35] "Oscillospirales"                      
## [36] "Burkholderiales"                      
## [37] "Clostridia UCG-014"                   
## [38] "Actinobacteria"                       
## [39] "Coriobacteriia"                       
## [40] "Gammaproteobacteria"                  
## [41] "Verrucomicrobiae"                     
## [42] "Actinobacteriota"                     
## [43] "Proteobacteria"                       
## [44] "Verrucomicrobiota"
glucose.df %>%
  rstatix::cor_test(vars = c(AUC, MaxPeak, Glucose_min_0),
           vars2 = c("Bifidobacterium":"Verrucomicrobiota"),
           method = "spearman") -> glucose.corr.res

glucose.corr.res$n <- nrow(glucose.df)
glucose.corr.res$Week <- "0, 10, 13 (aggregated)"

3.5 Serum protein correlations

protein.df  %>% dplyr::rename(
  C_peptide_pg_ml = `C-Peptide_pg_ml`,
  GLP1_pM = `GLP-1_pM`,
  Insulin_uIU_ml = `Insulin_uIU/ml`
) -> protein.df

meta.slim %>%
  filter(Mouse %in% protein.df$Sample) %>%
  filter(Week==13) %>%
  dplyr::rename(Sample = Mouse) %>%
  right_join(protein.df, by = "Sample") %>%
  select(Label, Sample, C_peptide_pg_ml:PYY_pg_ml) %>% 
  left_join(FullRA.df, by = "Label") -> protein.micro.corr

protein.micro.corr %>% colnames()
##  [1] "Label"                                
##  [2] "Sample"                               
##  [3] "C_peptide_pg_ml"                      
##  [4] "Ghrelin_pg_ml"                        
##  [5] "GLP1_pM"                              
##  [6] "Insulin_uIU_ml"                       
##  [7] "Leptin_pg_ml"                         
##  [8] "PYY_pg_ml"                            
##  [9] "Bifidobacterium"                      
## [10] "Romboutsia"                           
## [11] "Lactococcus"                          
## [12] "Rikenella"                            
## [13] "Prevotellaceae UCG-001"               
## [14] "Rikenellaceae RC9 gut group"          
## [15] "Coriobacteriaceae UCG-002"            
## [16] "Prevotellaceae NK3B31 group"          
## [17] "Tyzzerella"                           
## [18] "Parasutterella"                       
## [19] "Lachnospiraceae UCG-001"              
## [20] "Muribaculum"                          
## [21] "Tuzzerella"                           
## [22] "Candidatus Arthromitus"               
## [23] "Acetatifactor"                        
## [24] "Marvinbryantia"                       
## [25] "Butyricicoccus"                       
## [26] "Candidatus Stoquefichus"              
## [27] "Lactobacillaceae"                     
## [28] "Bifidobacteriaceae"                   
## [29] "Peptostreptococcaceae"                
## [30] "Streptococcaceae"                     
## [31] "Prevotellaceae"                       
## [32] "Atopobiaceae"                         
## [33] "Sutterellaceae"                       
## [34] "Enterococcaceae"                      
## [35] "[Eubacterium] coprostanoligenes group"
## [36] "Erysipelatoclostridiaceae"            
## [37] "Bifidobacteriales"                    
## [38] "Peptostreptococcales-Tissierellales"  
## [39] "Oscillospirales"                      
## [40] "Burkholderiales"                      
## [41] "Clostridia UCG-014"                   
## [42] "Actinobacteria"                       
## [43] "Coriobacteriia"                       
## [44] "Gammaproteobacteria"                  
## [45] "Verrucomicrobiae"                     
## [46] "Actinobacteriota"                     
## [47] "Proteobacteria"                       
## [48] "Verrucomicrobiota"
protein.micro.corr %>%
  rstatix::cor_test(vars = c(C_peptide_pg_ml:PYY_pg_ml),
           vars2 = c("Bifidobacterium":"Verrucomicrobiota"),
           method = "spearman") -> protein.corr.res
protein.df %>% pivot_longer(
  cols = c(C_peptide_pg_ml:PYY_pg_ml),
  names_to = "var1",
  values_to = "value"
) %>% filter(is.na(value) == FALSE) %>% 
  dplyr::count(var1) -> protein.n

protein.n
protein.corr.res$Week <- "13"
protein.corr.res %>% left_join(protein.n, by = "var1") -> protein.corr.res

3.6 Immune cells

diff.meta.df %>%
  dplyr::filter(Intranasal_Treatment == "HDM") %>% 
  select(Label, Per_Eos, Per_Neut) %>%
  left_join(FullRA.df, by = "Label") %>%
  filter(is.na(Label) == FALSE) -> diff.corr

diff.corr %>% colnames()
##  [1] "Label"                                
##  [2] "Per_Eos"                              
##  [3] "Per_Neut"                             
##  [4] "Bifidobacterium"                      
##  [5] "Romboutsia"                           
##  [6] "Lactococcus"                          
##  [7] "Rikenella"                            
##  [8] "Prevotellaceae UCG-001"               
##  [9] "Rikenellaceae RC9 gut group"          
## [10] "Coriobacteriaceae UCG-002"            
## [11] "Prevotellaceae NK3B31 group"          
## [12] "Tyzzerella"                           
## [13] "Parasutterella"                       
## [14] "Lachnospiraceae UCG-001"              
## [15] "Muribaculum"                          
## [16] "Tuzzerella"                           
## [17] "Candidatus Arthromitus"               
## [18] "Acetatifactor"                        
## [19] "Marvinbryantia"                       
## [20] "Butyricicoccus"                       
## [21] "Candidatus Stoquefichus"              
## [22] "Lactobacillaceae"                     
## [23] "Bifidobacteriaceae"                   
## [24] "Peptostreptococcaceae"                
## [25] "Streptococcaceae"                     
## [26] "Prevotellaceae"                       
## [27] "Atopobiaceae"                         
## [28] "Sutterellaceae"                       
## [29] "Enterococcaceae"                      
## [30] "[Eubacterium] coprostanoligenes group"
## [31] "Erysipelatoclostridiaceae"            
## [32] "Bifidobacteriales"                    
## [33] "Peptostreptococcales-Tissierellales"  
## [34] "Oscillospirales"                      
## [35] "Burkholderiales"                      
## [36] "Clostridia UCG-014"                   
## [37] "Actinobacteria"                       
## [38] "Coriobacteriia"                       
## [39] "Gammaproteobacteria"                  
## [40] "Verrucomicrobiae"                     
## [41] "Actinobacteriota"                     
## [42] "Proteobacteria"                       
## [43] "Verrucomicrobiota"
diff.corr %>%
  rstatix::cor_test(vars = c(Per_Neut, Per_Eos),
           vars2 = c(`Bifidobacterium`:Verrucomicrobiota),
           method = "spearman") -> diff.corr.res

diff.corr.res$n <- nrow(diff.meta.df)
diff.corr.res$Week <- "13"

3.7 Trichrome and histopathology data

trichrome.df %>% dplyr::select(Mouse, Mean_Percent_Trichrome_Area_outliers_removed) %>%
  filter(is.na(Mean_Percent_Trichrome_Area_outliers_removed) == FALSE) %>% 
  distinct(.keep_all = TRUE) -> trichrome.mean

hist.df %>% dplyr::select(Mouse, HE_score, PAS_score) %>% 
  full_join(trichrome.mean, by = "Mouse") -> hist.tri

meta.slim %>% filter(Week == 13 & Mouse %in% hist.tri$Mouse) %>% 
  dplyr::select(Label, Mouse) %>% left_join(hist.tri, by = "Mouse") -> hist.tri.sam

FullRA.df %>% filter(Label %in% hist.tri.sam$Label) %>%
  right_join(hist.tri.sam, by = "Label") -> hist.tri.corr

hist.tri.corr %>% colnames()
##  [1] "Label"                                       
##  [2] "Bifidobacterium"                             
##  [3] "Romboutsia"                                  
##  [4] "Lactococcus"                                 
##  [5] "Rikenella"                                   
##  [6] "Prevotellaceae UCG-001"                      
##  [7] "Rikenellaceae RC9 gut group"                 
##  [8] "Coriobacteriaceae UCG-002"                   
##  [9] "Prevotellaceae NK3B31 group"                 
## [10] "Tyzzerella"                                  
## [11] "Parasutterella"                              
## [12] "Lachnospiraceae UCG-001"                     
## [13] "Muribaculum"                                 
## [14] "Tuzzerella"                                  
## [15] "Candidatus Arthromitus"                      
## [16] "Acetatifactor"                               
## [17] "Marvinbryantia"                              
## [18] "Butyricicoccus"                              
## [19] "Candidatus Stoquefichus"                     
## [20] "Lactobacillaceae"                            
## [21] "Bifidobacteriaceae"                          
## [22] "Peptostreptococcaceae"                       
## [23] "Streptococcaceae"                            
## [24] "Prevotellaceae"                              
## [25] "Atopobiaceae"                                
## [26] "Sutterellaceae"                              
## [27] "Enterococcaceae"                             
## [28] "[Eubacterium] coprostanoligenes group"       
## [29] "Erysipelatoclostridiaceae"                   
## [30] "Bifidobacteriales"                           
## [31] "Peptostreptococcales-Tissierellales"         
## [32] "Oscillospirales"                             
## [33] "Burkholderiales"                             
## [34] "Clostridia UCG-014"                          
## [35] "Actinobacteria"                              
## [36] "Coriobacteriia"                              
## [37] "Gammaproteobacteria"                         
## [38] "Verrucomicrobiae"                            
## [39] "Actinobacteriota"                            
## [40] "Proteobacteria"                              
## [41] "Verrucomicrobiota"                           
## [42] "Mouse"                                       
## [43] "HE_score"                                    
## [44] "PAS_score"                                   
## [45] "Mean_Percent_Trichrome_Area_outliers_removed"
hist.tri.corr %>%
  rstatix::cor_test(vars = c(HE_score:Mean_Percent_Trichrome_Area_outliers_removed),
           vars2 = c("Bifidobacterium":"Verrucomicrobiota"),
           method = "spearman") -> hist.corr.res


hist.corr.res$n <- nrow(hist.tri.corr)
hist.corr.res$Week <- "13"

4 Aggregate and correct for multiple testing

Weight.corr.res$Test <- "Weight"
glucose.corr.res$Test <- "Glucose tolerance testing"
diff.corr.res$Test <- "% immune cells; HDM"
protein.corr.res$Test <- "Serum proteins"
hist.corr.res$Test <- "Histology or % trichrome; HDM"

rbind(Weight.corr.res,
      glucose.corr.res,
      diff.corr.res,
      protein.corr.res,
      hist.corr.res) %>% adjust_pvalue(method = "BH") %>%
  filter(p.adj < 0.05) %>%
  add_significance() %>%
  mutate(p.adj.print = format(p.adj, digits = 2)) -> micro.corr.res
  
f.intersect %>% dplyr::select(row, Compare_taxon) %>% distinct() -> f.intersect.taxa

micro.corr.res %>% dplyr::rename(Health_metric = var1, row = var2, Metric_Type = Test) %>% 
  left_join(f.intersect.taxa, by = "row") %>%
  dplyr::rename(Taxonomic_Rank = Compare_taxon,
                Taxon = row) %>% 
  dplyr::mutate(Taxonomic_Rank = factor(Taxonomic_Rank, levels = c("Phylum", "Class", "Order", "Family", "Genus"))) %>% 
  arrange(Taxonomic_Rank, Taxon) -> micro.corr.res

write_csv(micro.corr.res, file.path(git.dir, "Microbiome_corr_res.csv"), append = FALSE, col_names = TRUE)
micro.corr.res
micro.corr.res %>% datatable()
micro.corr.res %>% dplyr::count(Taxon) %>% arrange(desc(n))

5 Graphs: all

f.ra %>% filter(OTU %in% f.intersect$row) %>% 
  dplyr::select(-Sample) %>% 
  dplyr::rename(taxon = OTU, RA = Abundance) %>% 
  mutate(Week_factor = factor(Week, levels = c("0", "10", "13")),
         Diet = str_replace(Diet, "Normal_Chow", "Normal Chow"),
         Diet = str_replace(Diet, "High_Fat_Diet", "High Fat Diet"),
         Diet = factor(Diet, levels = c("Normal Chow", "High Fat Diet")),
         Label = as.character(Label)) -> f.ra.graph
f.ra.graph
Weight.corr.df %>% dplyr::select(Label, Weight) %>% 
  right_join(f.ra.graph, by = "Label") -> f.ra.graph.corr
head(f.ra.graph.corr)
corr.df %>% select(Label, AUC:Glucose_min_0) %>%
  right_join(f.ra.graph.corr, by = "Label") -> f.ra.graph.corr

protein.micro.corr %>% select(Label, C_peptide_pg_ml:PYY_pg_ml) %>%
  right_join(f.ra.graph.corr, by = "Label") -> f.ra.graph.corr

hist.tri.sam %>% select(Label, HE_score:Mean_Percent_Trichrome_Area_outliers_removed) %>%
  right_join(f.ra.graph.corr, by = "Label") -> f.ra.graph.corr

diff.meta.df %>% select(Label, Per_Eos:Per_Neut) %>%
  right_join(f.ra.graph.corr, by = "Label") -> f.ra.graph.corr
head(f.ra.graph.corr)
micro.corr.res
for (i in 1:length(micro.corr.res$Health_metric)) {
  yvar <- micro.corr.res$Health_metric[i]
  q <- micro.corr.res$p.adj.print[i]
  value <- micro.corr.res$cor[[i]]
  str_glue("p = ", q, "\n Cor. coeff = ", value) -> label_corr

  f.ra.graph.corr %>%
    filter(taxon == micro.corr.res$Taxon[i]) %>%
    ggplot(aes(x = RA, y = .data[[yvar]])) + 
      geom_point() +
      labs(y= str_glue(micro.corr.res$Health_metric[i]),
           x= str_glue(micro.corr.res$Taxon[i], " ","Relative Abundance"),
           title = str_glue("Spearman's rank correlation, ", yvar, 
                            " & ", micro.corr.res$Taxon[i])) +
      theme_bw()+ expand_limits(y = 0) +
      geom_text(aes(label = label_corr), x = Inf, y = Inf, vjust = "inward", hjust = "inward", family = "sans") -> p
      print(p)
}
## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 136 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 144 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 144 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 144 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 144 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 136 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

## Warning: Removed 3 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 84 rows containing missing values (`geom_point()`).

## Warning: Removed 140 rows containing missing values (`geom_point()`).

6 Specific taxa

Based on literature review, focus on Lachnospiraceae UCG-001, Bifidobacterium, and Parasutterella.

f.ra.graph %>% 
  filter(taxon == "Lachnospiraceae UCG-001" & Surgery == "None") %>%  # no surgery group!!!! 
  ggplot(aes(x = Week_factor, y = RA)) + 
  geom_boxplot(width=0.2) + geom_point(size = 1) +  
  facet_wrap(vars(Diet))  + 
  labs(y="*Lachnospiraceae UCG-001* Relative Abundance", x = "Week") + 
  theme_bw(base_size = 14)+ 
  theme(axis.title.y = element_markdown()) -> fig.lachno.rel

fig.lachno.rel

f.ra.graph %>% 
  filter(taxon == "Bifidobacterium" & Surgery == "None") %>%  # no surgery group!!!! 
  ggplot(aes(x = Week_factor, y = RA)) + 
  geom_boxplot(width=0.2) + geom_point(size = 1) +  
  facet_wrap(vars(Diet))  + 
  labs(y="*Bifidobacterium* Relative Abundance", x = "Week") + 
  theme_bw(base_size = 14) + 
  theme(axis.title.y = element_markdown())-> fig.Bifido.rel

fig.Bifido.rel

f.ra.graph %>% 
  filter(taxon == "Parasutterella" & Surgery == "None") %>%  # no surgery group!!!! 
  ggplot(aes(x = Week_factor, y = RA)) + 
  geom_boxplot(width=0.2) + geom_point(size = 1) +  
  facet_wrap(vars(Diet))  + 
  labs(y="*Parasutterella* Relative Abundance", x = "Week") + 
  theme_bw(base_size = 14) + 
  theme(axis.title.y = element_markdown()) -> fig.para.rel

fig.para.rel

micro.corr.res %>% 
  filter(Taxon == "Lachnospiraceae UCG-001" | Taxon == "Bifidobacterium" | Taxon == "Parasutterella") %>%
  arrange(cor)
protein.micro.corr %>%
    ggplot(aes(x = `Lachnospiraceae UCG-001`, y = GLP1_pM)) + 
      geom_point() +
      labs(y= "GLP-1 (pM)",
           x= "*Lachnospiraceae UCG-001* Relative Abundance") +
      theme_bw(base_size = 14)+ expand_limits(y = 0) + 
  theme(axis.title.x = element_markdown())-> fig.lachno.glp1
fig.lachno.glp1

protein.micro.corr %>%
    ggplot(aes(x = Bifidobacterium, y = GLP1_pM)) + 
      geom_point() +
      labs(y= "GLP-1 (pM)",
           x= "*Bifidobacterium* Relative Abundance") +
      theme_bw(base_size = 14)+ expand_limits(y = 0)+ 
  theme(axis.title.x = element_markdown())  -> fig.bifido.glp1

fig.bifido.glp1

protein.micro.corr %>%
    ggplot(aes(x = Parasutterella, y = GLP1_pM)) + 
      geom_point() +
      labs(y= "GLP-1 (pM)",
           x= "*Parasutterella* Relative Abundance") +
      theme_bw(base_size = 14)+ expand_limits(y = 0) + 
  theme(axis.title.x = element_markdown()) -> fig.para.glp1

fig.para.glp1

plot_grid(
  fig.Bifido.rel, fig.bifido.glp1,
  fig.lachno.rel, fig.lachno.glp1,
  fig.para.rel, fig.para.glp1,
  labels = "AUTO", ncol = 2, label_size = 20, scale = 0.9,
  rel_widths = c(1.3, 1)
) -> fig.corr
fig.corr

ragg::agg_jpeg(file.path(fig.dir, "fecal_corr.jpeg"), width = 12, height = 12, units = "in", res = 600)
fig.corr
dev.off()
## png 
##   2

7 Reproducibility

sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ggtext_0.1.2       cowplot_1.1.1      Hmisc_5.1-1        phyloseq_1.44.0   
##  [5] stringr_1.5.0      tibble_3.2.1       readr_2.1.4        tidyr_1.3.0       
##  [9] dplyr_1.1.3        plotly_4.10.3      DT_0.29            ggrepel_0.9.4     
## [13] RColorBrewer_1.1-3 pals_1.8           ggpubr_0.6.0       ggplot2_3.4.4     
## [17] rstatix_0.7.2     
## 
## loaded via a namespace (and not attached):
##   [1] rstudioapi_0.15.0       jsonlite_1.8.7          magrittr_2.0.3         
##   [4] farver_2.1.1            rmarkdown_2.25          zlibbioc_1.46.0        
##   [7] ragg_1.2.6              vctrs_0.6.4             multtest_2.56.0        
##  [10] RCurl_1.98-1.13         base64enc_0.1-3         htmltools_0.5.6.1      
##  [13] broom_1.0.5             Rhdf5lib_1.22.1         Formula_1.2-5          
##  [16] rhdf5_2.44.0            sass_0.4.7              bslib_0.5.1            
##  [19] htmlwidgets_1.6.2       plyr_1.8.9              cachem_1.0.8           
##  [22] commonmark_1.9.0        igraph_1.5.1            lifecycle_1.0.3        
##  [25] iterators_1.0.14        pkgconfig_2.0.3         Matrix_1.6-1.1         
##  [28] R6_2.5.1                fastmap_1.1.1           GenomeInfoDbData_1.2.10
##  [31] digest_0.6.33           colorspace_2.1-0        S4Vectors_0.38.2       
##  [34] textshaping_0.3.7       crosstalk_1.2.0         vegan_2.6-4            
##  [37] labeling_0.4.3          fansi_1.0.5             httr_1.4.7             
##  [40] abind_1.4-5             mgcv_1.9-0              compiler_4.3.1         
##  [43] bit64_4.0.5             withr_2.5.1             htmlTable_2.4.2        
##  [46] backports_1.4.1         carData_3.0-5           maps_3.4.1             
##  [49] ggsignif_0.6.4          MASS_7.3-60             biomformat_1.28.0      
##  [52] permute_0.9-7           tools_4.3.1             foreign_0.8-85         
##  [55] ape_5.7-1               nnet_7.3-19             glue_1.6.2             
##  [58] nlme_3.1-163            rhdf5filters_1.12.1     gridtext_0.1.5         
##  [61] grid_4.3.1              checkmate_2.3.0         cluster_2.1.4          
##  [64] reshape2_1.4.4          ade4_1.7-22             generics_0.1.3         
##  [67] gtable_0.3.4            tzdb_0.4.0              data.table_1.14.8      
##  [70] hms_1.1.3               xml2_1.3.5              car_3.1-2              
##  [73] utf8_1.2.4              XVector_0.40.0          BiocGenerics_0.46.0    
##  [76] foreach_1.5.2           pillar_1.9.0            markdown_1.11          
##  [79] vroom_1.6.4             splines_4.3.1           lattice_0.22-5         
##  [82] survival_3.5-7          bit_4.0.5               tidyselect_1.2.0       
##  [85] Biostrings_2.68.1       knitr_1.44              gridExtra_2.3          
##  [88] IRanges_2.34.1          stats4_4.3.1            xfun_0.40              
##  [91] Biobase_2.60.0          stringi_1.7.12          lazyeval_0.2.2         
##  [94] yaml_2.3.7              evaluate_0.22           codetools_0.2-19       
##  [97] cli_3.6.1               rpart_4.1.21            systemfonts_1.0.5      
## [100] munsell_0.5.0           jquerylib_0.1.4         dichromat_2.0-0.1      
## [103] Rcpp_1.0.11             GenomeInfoDb_1.36.4     mapproj_1.2.11         
## [106] ellipsis_0.3.2          bitops_1.0-7            viridisLite_0.4.2      
## [109] scales_1.2.1            purrr_1.0.2             crayon_1.5.2           
## [112] rlang_1.1.1